
    Reactive search for MAX-SAT: diversification-bias properties with prohibitions and penalties

    Many incomplete approaches for SAT and MAX-SAT have been proposed in recent years. The objective of this investigation is not so much horse-racing (beating the competition on selected benchmarks) as understanding the qualitative differences between the various approaches by analyzing simplified versions thereof. In particular, we focus on reactive search schemes where task-dependent and local properties of the configuration space are used for the dynamic on-line tuning of local search parameters. We consider the choice between prohibition-based and penalty-based reactive approaches, and the choice between considering all variables or only the variables appearing in unsatisfied clauses. On the simplified versions we study the trade-off between diversification and bias after starting from a local minimizer, via the so-called D-B plots. We then consider long runs of the complete algorithms on selected MAX-SAT instances, measuring both the number of iterations (flips) and the CPU time required by the single iterations with efficient data structures. The results confirm the effectiveness of reactive approaches, in particular when combined with non-oblivious objective functions. Furthermore, a complex non-linear behavior of penalty-based schemes is observed.
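The prohibition-based side of this comparison can be sketched in a few lines: a minimal reactive tabu search for MAX-SAT that restricts flips to variables appearing in unsatisfied clauses and adjusts the prohibition period on-line when a configuration repeats. The instance format, parameter values, and exact reaction rule below are illustrative assumptions, not the paper's scheme.

```python
import random

def reactive_maxsat(clauses, n_vars, iters=2000, seed=0):
    """Minimal reactive (prohibition-based) local search for MAX-SAT.

    clauses: list of clauses, each a list of non-zero ints (DIMACS style:
    literal v means variable v true, -v means false).  Returns the best
    number of satisfied clauses found.
    """
    rng = random.Random(seed)
    assign = [rng.choice([False, True]) for _ in range(n_vars + 1)]

    def satisfied(cl):
        return any((lit > 0) == assign[abs(lit)] for lit in cl)

    def score():
        return sum(satisfied(cl) for cl in clauses)

    tenure = 1                       # prohibition period, tuned on-line
    last_flip = [-10**9] * (n_vars + 1)
    seen = {}                        # configuration -> last iteration seen
    best = score()
    for t in range(iters):
        unsat = [cl for cl in clauses if not satisfied(cl)]
        if not unsat:
            break
        # restrict moves to non-prohibited variables of unsatisfied clauses
        cands = {abs(l) for cl in unsat for l in cl
                 if t - last_flip[abs(l)] > tenure}
        if not cands:
            tenure = max(1, tenure - 1)   # everything prohibited: relax
            continue

        def gain(v):                 # score after tentatively flipping v
            assign[v] = not assign[v]
            s = score()
            assign[v] = not assign[v]
            return s

        v = max(cands, key=gain)     # greedy move among candidates
        assign[v] = not assign[v]
        last_flip[v] = t
        best = max(best, score())
        # reaction: a repeated configuration signals entrapment,
        # so the prohibition period is increased
        key = tuple(assign)
        if key in seen:
            tenure += 1
        seen[key] = t
    return best
```

Scoring every clause at each step keeps the sketch short; the efficient incremental data structures mentioned in the abstract would replace these full recomputations in a serious implementation.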

    Active learning of Pareto fronts

    This work introduces the Active Learning of Pareto fronts (ALP) algorithm, a novel approach to recover the Pareto front of a multi-objective optimization problem. ALP casts the identification of the Pareto front into a supervised machine learning task. This approach enables an analytical model of the Pareto front to be built. The computational effort in generating the supervised information is reduced by an active learning strategy. In particular, the model is learnt from a set of informative training objective vectors: approximated Pareto-optimal vectors obtained by solving different scalarized problem instances. The experimental results show that ALP achieves an accurate Pareto front approximation with a lower computational effort than state-of-the-art Estimation of Distribution Algorithms and widely known genetic techniques.
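The core idea can be illustrated on a toy bi-objective problem (assumed here: f1 = x², f2 = (x−2)² on [0, 2], whose front is known analytically). After querying the two extreme scalarizations, each new query targets the least-explored stretch of the current front approximation, a crude stand-in for ALP's model-based active learning criterion:

```python
import math

def scalarized_min(w, xs):
    """Solve one weighted-sum scalarization min_x w*f1 + (1-w)*f2 by grid
    search over xs.  The bi-objective problem is a toy assumption:
    f1(x) = x^2, f2(x) = (x-2)^2."""
    f1 = lambda t: t * t
    f2 = lambda t: (t - 2) ** 2
    x = min(xs, key=lambda t: w * f1(t) + (1 - w) * f2(t))
    return (f1(x), f2(x))

def alp_sketch(n_queries=6):
    """Active-learning sketch: keep weights sorted, and insert the next
    query weight midway between the pair whose front points are farthest
    apart (a proxy for where the front model is most uncertain).
    Returns the approximated Pareto-optimal vectors."""
    xs = [i / 100 for i in range(201)]        # search grid on [0, 2]
    ws = [0.0, 1.0]                           # extreme scalarizations first
    pts = [scalarized_min(w, xs) for w in ws]
    while len(ws) < n_queries:
        # widest gap along the current front approximation
        _, i = max((math.dist(pts[i], pts[i + 1]), i)
                   for i in range(len(ws) - 1))
        w_new = (ws[i] + ws[i + 1]) / 2
        ws.insert(i + 1, w_new)
        pts.insert(i + 1, scalarized_min(w_new, xs))
    return pts
```

Because this toy front is convex, every weighted-sum optimum lies on it; for non-convex fronts ALP-style approaches rely on other scalarizations (e.g. Tchebycheff), which this sketch does not cover.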

    Joint Learning and Optimization of Unknown Combinatorial Utility Functions

    This work considers the problem of automatically discovering the solution preferred by a decision maker (DM). Her preferences are formalized as a combinatorial utility function, but they are not fully defined at the beginning and need to be learnt during the search for a satisficing solution. The initial information is limited to a set of catalog features from which the decisional variables of the DM are to be selected. An interactive optimization procedure is introduced, which iteratively learns an approximation of the utility function modeling the quality of candidate solutions and uses it to generate novel candidates for the following refinement. The source of learning signals is the decision maker, who is fine-tuning her preferences based on the learning process triggered by the presentation of tentative solutions. The proposed approach focuses on combinatorial utility functions consisting of a weighted sum of conjunctions of predicates in a certain theory of interest. The learning stage exploits the sparsity-inducing property of 1-norm regularization to learn a combinatorial function from the power set of all possible conjunctions of the predicates up to a certain degree. The optimization stage consists of maximizing the learnt combinatorial utility function to generate novel candidate solutions. The maximization is cast into an Optimization Modulo Theory problem, a recent formalism that efficiently handles both discrete and continuous-valued decisional features. Experiments on realistic problems demonstrate the effectiveness of the method in focusing the search towards the optimal solution and its ability to recover from suboptimal initial choices.
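The learn-then-maximize pipeline can be illustrated on a drastically simplified case: Boolean variables only, conjunctions of positive literals up to degree 2, a tiny ISTA solver for the 1-norm-regularized fit, and exhaustive enumeration standing in for the Optimization Modulo Theory solver. Each of those choices is an assumption made to keep the sketch self-contained.

```python
from itertools import combinations, product

def conj_features(x, degree=2):
    """Map a Boolean assignment x to its conjunction features up to
    `degree` (conjunctions of positive literals, a simplifying
    assumption).  Each feature is 1.0 iff all its variables are true."""
    idx = range(len(x))
    return [1.0 if all(x[i] for i in S) else 0.0
            for d in range(1, degree + 1)
            for S in combinations(idx, d)]

def lasso_ista(X, y, lam=0.05, lr=0.1, steps=2000):
    """Tiny ISTA solver for 1-norm-regularized least squares: a gradient
    step on 0.5/n * ||Xw - y||^2 followed by soft-thresholding, which
    drives the weights of irrelevant conjunctions to exactly zero."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(steps):
        r = [sum(X[i][j] * w[j] for j in range(p)) - y[i] for i in range(n)]
        g = [sum(X[i][j] * r[i] for i in range(n)) / n for j in range(p)]
        for j in range(p):
            z = w[j] - lr * g[j]
            w[j] = max(z - lr * lam, 0.0) if z > 0 else min(z + lr * lam, 0.0)
    return w

def maximize_learnt(w, n_vars, degree=2):
    """Optimization stage, done by exhaustive search over assignments
    (standing in for the OMT solver used in the actual work)."""
    score = lambda x: sum(wi * fi
                          for wi, fi in zip(w, conj_features(x, degree)))
    return max(product([0, 1], repeat=n_vars), key=score)
```

For example, fitting assignments scored by the hidden utility 2·x0 + 3·(x1 ∧ x2) recovers large weights only on the x0 and x1∧x2 features, and maximizing the learnt function returns the all-true assignment.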

    A Reactive Search Optimization approach to interactive decision making

    Reactive Search Optimization (RSO) advocates the integration of learning techniques into search heuristics for solving complex optimization problems. In the last few years, RSO has mostly been employed to self-adapt a local search method based on the previous history of the search. The learning signals consisted of data about the structural characteristics of the instance collected while the algorithm is running: for example, sizes of basins of attraction, entrapment of trajectories, or repetitions of previously visited configurations. In this context, the algorithm learns by interacting with a previously unknown environment given by an existing (and fixed) problem definition. This thesis considers a second interesting online learning loop, where the source of learning signals is the decision maker, who is fine-tuning her preferences (formalized as a utility function) based on a learning process triggered by the presentation of tentative solutions. The objective function and, more generally, the problem definition is not fully stated at the beginning and needs to be refined during the search for a satisfying solution. In practice, this lack of complete knowledge may occur for different reasons: insufficient or costly knowledge elicitation, soft constraints which are in the mind of the decision maker, revision of preferences after becoming aware of some possible solutions, etc. The work developed in the thesis can be classified within the well-known paradigm of Interactive Decision Making (IDM). In particular, it considers interactive optimization from a machine learning perspective, where IDM is seen as a joint learning process involving the optimization component and the DM herself.
    During the interactive process, on one hand, the decision maker improves her knowledge about the problem in question and, on the other hand, the preference model learnt by the optimization component evolves in response to the additional information provided by the user. We believe that understanding the interplay between these two learning processes is essential to improve the design of interactive decision making systems. This thesis goes in this direction, 1) by considering a final user who may change her preferences as a result of her deeper knowledge of the problem and who may occasionally provide inconsistent feedback during the interactive process, and 2) by introducing two IDM techniques that can learn an arbitrary preference model in these changing and noisy conditions. The investigation is performed within two different problem settings: traditional multi-objective optimization and a constraint-based formulation of the DM preferences. In both cases, the ultimate goal of the IDM algorithm developed is the identification of the solution preferred by the final user. This task is accomplished by alternating a learning phase, which generates an approximated model of the user preferences, with an optimization stage, which identifies the optimizers of the current model. The current tentative solutions are evaluated by the final user, providing additional training data. However, the cognitive limitations of the user analyzing the tentative solutions demand that the amount of elicited information be minimized. This requires a shift of paradigm with respect to standard machine learning strategies, in order to model the relevant areas of the optimization surface rather than reconstruct it entirely. In our approach the shift is obtained both by the application of well-known active learning principles during the learning phase and by a suitable trade-off between diversification and intensification of the search during the optimization stage.
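One cycle of the alternation described above, stripped to its bare bones: a simulated DM answers utility queries, a preference model is fit from a minimal set of elicited evaluations, and its maximizer becomes the next tentative solution shown to the DM. A linear utility over binary variables is a strong simplifying assumption made only for this sketch; the thesis handles far richer models and noisy feedback.

```python
import random

def interactive_round(true_utility, n_vars, noise=0.0, seed=0):
    """One learn/optimize cycle of the interactive loop, assuming the
    DM's utility is linear in binary decision variables.  Feedback can
    be perturbed by Gaussian noise to mimic occasionally inconsistent
    answers (noise=0 gives exact feedback)."""
    rng = random.Random(seed)

    def ask_dm(x):
        # elicit one evaluation from the (simulated) decision maker
        return true_utility(x) + (rng.gauss(0, noise) if noise else 0)

    # learning stage: query the empty solution and each unit vector --
    # the minimal elicitation that identifies a linear utility exactly
    base = ask_dm((0,) * n_vars)
    weights = []
    for i in range(n_vars):
        unit = tuple(int(j == i) for j in range(n_vars))
        weights.append(ask_dm(unit) - base)

    # optimization stage: maximize the learnt linear model exactly;
    # the maximizer is the next tentative solution shown to the DM,
    # whose reaction would trigger the following refinement round
    best = tuple(1 if w > 0 else 0 for w in weights)
    return weights, best
```

With the hidden utility 3·x0 − 2·x1 + x2 and exact feedback, the learnt weights match the true ones and the proposed solution sets exactly the positively weighted variables.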